Contrastive Filtering of Domain-Specific Multi-Word Terms from Different Types of Corpora
Abstract
In this paper we tackle the challenging task of multi-word term (MWT) extraction from different types of specialized corpora. Contrastive filtering of previously extracted MWTs yields a considerable increase in the number of acquired domain-specific terms.
Similar resources
Conceptual Structure of Automatically Extracted Multi-Word Terms from Domain Specific Corpora: a Case Study for Italian
This paper is based on our efforts on the automatic extraction of multi-word terms and their conceptual structure for multiple languages. At present, we mainly focus on English and the major Romance languages such as French, Spanish, Portuguese, and Italian. This paper is a case study for the Italian language. We present how to automatically build the conceptual structure of automatically extracted multi-word t...
Lexical Bundles in English Abstracts of Research Articles Written by Iranian Scholars: Examples from Humanities
This paper investigates a special type of recurrent expressions, lexical bundles, defined as a sequence of three or more words that co-occur frequently in a particular register (Biber et al., 1999). Considering the importance of this group of multi-word sequences in academic prose, this study explores the forms and syntactic structures of three- and four-word bundles in English abstracts writte...
Multi-word term extraction from comparable corpora by combining contextual and constituent clues
In this paper we present an approach to automatically extract and align multi-word terms from an English-Slovene comparable health corpus. First, the terms are extracted from the corpus for each language separately using a list of user-adjustable morphosyntactic patterns and a term weighting measure. Then, the extracted terms are aligned in a bag-of-equivalents fashion with a seed bilingual lex...
Revising the Compositional Method for Terminology Acquisition from Comparable Corpora
In this paper, we present a new method that improves the alignment of equivalent terms monolingually acquired from bilingual comparable corpora: the Compositional Method with Context-Based Projection (CMCBP). Our overall objective is to identify and translate highly specialized terminology made up of multi-word terms acquired from comparable corpora. Our evaluation in the medical domain and fo...
A Comparative and Contrastive Study on the Meaning Extension of Color Terms in Persian and English
We deal with a wide range of colors in our daily life. They are such ubiquitous phenomena that it is hard, if not next to impossible, to imagine even a single entity (be it an object, place, living creature, etc.) devoid of them. They are like death and taxes, which nobody can dispense with. This omnipresence of colors around us has also made its way into abstract and less tangible entities via the inte...